Search CORE

282 research outputs found

Fast deterministic processor allocation

Author: Hagerup T.
Publication venue: Max-Planck-Institut für Informatik
Publication date: 01/01/1992
Field of study

Interval allocation has been suggested as a possible formalization for the PRAM of the (vaguely defined) processor allocation problem, which is of fundamental importance in parallel computing. The interval allocation problem is, given

n

nonnegative integers

x_1,\ldots,x_n

, to allocate

n

nonoverlapping subarrays of sizes

x_1,\ldots,x_n

from within a base array of

O(\sum_{j=1}^n x_j)

cells. We show that interval allocation problems of size

n

can be solved in

O((\log\log n)^3)

time with optimal speedup on a deterministic CRCW PRAM. In addition to a general solution to the processor allocation problem, this implies an improved deterministic algorithm for the problem of approximate summation. For both interval allocation and approximate summation, the fastest previous deterministic algorithms have running times of

\Theta({{\log n}/{\log\log n}})

. We also describe an application to the problem of computing the connected components of an undirected graph

MPG.PuRe

On a compaction theorem of ragde

Author: Hagerup T.
Publication venue: Max-Planck-Institut für Informatik
Publication date: 01/01/1991
Field of study

Ragde demonstrated that in constant time a PRAM with

n

processors can move at most

k

items, stored in distinct cells of an array of size

n

, to distinct cells in an array of size at most

k^4

. We show that the exponent of 4 in the preceding sentence can be replaced by any constant greater than~2

OPUS Augsburg

MPG.PuRe

Optimal parallel string algorithms: sorting, merching and computing the minimum

Author: Hagerup T.
Publication venue: Max-Planck-Institut für Informatik
Publication date: 01/01/1993
Field of study

We study fundamental comparison problems on strings of characters, equipped with the usual lexicographical ordering. For each problem studied, we give a parallel algorithm that is optimal with respect to at least one criterion for which no optimal algorithm was previously known. Specifically, our main results are: % \begin{itemize} \item Two sorted sequences of strings, containing altogether

n

~characters, can be merged in

O(\log n)

time using

O(n)

operations on an EREW PRAM. This is optimal as regards both the running time and the number of operations. \item A sequence of strings, containing altogether

n

~characters represented by integers of size polynomial in~

n

, can be sorted in

O({{\log n}/{\log\log n}})

time using

O(n\log\log n)

operations on a CRCW PRAM. The running time is optimal for any polynomial number of processors. \item The minimum string in a sequence of strings containing altogether

n

characters can be found using (expected)

O(n)

operations in constant expected time on a randomized CRCW PRAM, in

O(\log\log n)

time on a deterministic CRCW PRAM with a program depending on~

n

, in

O((\log\log n)^3)

time on a deterministic CRCW PRAM with a program not depending on~

n

, in

O(\log n)

expected time on a randomized EREW PRAM, and in

O(\log n\log\log n)

time on a deterministic EREW PRAM. The number of operations is optimal, and the running time is optimal for the randomized algorithms and, if the number of processors is limited to~

n

, for the nonuniform deterministic CRCW PRAM algorithm as we

MPG.PuRe

Improved parallel integer sorting without concurrent writing

Author: Albers S.
Hagerup T.
Publication venue: Max-Planck-Institut für Informatik
Publication date: 01/01/1994
Field of study

We show that

n

integers in the range 1 \twodots n can be stably sorted on an \linebreak EREW PRAM using \nolinebreak

O(t)

time \linebreak and

O(n(\sqrt{\log n\log\log n}+{{(\log n)^2}/t}))

operations, for arbitrary given \linebreak

t\ge\log n\log\log n

, and on a CREW PRAM using %

O(\log n\log\log n)

time and

O(n\sqrt{\log n})

O(t)

time and

O(n(\sqrt{\log n}+{{\log n}/{2^{{t/{\log n}}}}}))

operations, for arbitrary given

t\ge\log n

. In addition, we are able to sort

n

arbitrary integers on a randomized CREW PRAM % using %

O(\log n\log\log n)

time and

O(n\sqrt{\log n})

operations within the same resource bounds with high probability. In each case our algorithm is a factor of almost

\Theta(\sqrt{\log n})

closer to optimality than all previous algorithms for the stated problem in the stated model, and our third result matches the operation count of the best known sequential algorithm. We also show that

n

integers in the range 1 \twodots m can be sorted in

O((\log n)^2)

time with

O(n)

operations on an EREW PRAM using a nonstandard word length of

O(\log n \log\log n \log m)

bits, thereby greatly improving the upper bound on the word length necessary to sort integers with a linear time-processor product, even sequentially. Our algorithms were inspired by, and in one case directly use, the fusion trees of Fredman and Willard

CiteSeerX

MPG.PuRe

Fast integer merging on the EREW PRAM

Author: Hagerup T.
Kutylowski M.
Publication venue: Max-Planck-Institut für Informatik
Publication date: 01/01/1992
Field of study

We investigate the complexity of merging sequences of small integers on the EREW PRAM. Our most surprising result is that two sorted sequences of

n

bits each can be merged in

O(\log\log n)

time. More generally, we describe an algorithm to merge two sorted sequences of

n

integers drawn from the set

\{0,\ldots,m-1\}

O(\log\log n+\log m)

time using an optimal number of processors. No sublogarithmic merging algorithm for this model of computation was previously known. The algorithm not only produces the merged sequence, but also computes the rank of each input element in the merged sequence. On the other hand, we show a lower bound of

\Omega(\log\min\{n,m\})

on the time needed to merge two sorted sequences of length

n

each with elements in the set

\{0,\ldots,m-1\}

, implying that our merging algorithm is as fast as possible for

m=(\log n)^{\Omega(1)}

. If we impose an additional stability condition requiring the ranks of each input sequence to form an increasing sequence, then the time complexity of the problem becomes

\Theta(\log n)

, even for

m=2

. Stable merging is thus harder than nonstable merging

MPG.PuRe

Succinct Indexable Dictionaries with Applications to Encoding $k$ -ary Trees, Prefix Sums and Multisets

Author: Fich F. E.
Grossi R.
Hagerup T.
Hagerup T.
Jansson J.
Munro J. I.
Paul W. J.
Rajeev Raman
Raman R.
Raman V.
Srinivasa Rao Satti
Venkatesh Raman
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 04/05/2007
Field of study

We consider the {\it indexable dictionary} problem, which consists of storing a set

S \subseteq \{0,...,m-1\}

for some integer

m

, while supporting the operations of \Rank(x), which returns the number of elements in

S

that are less than

x

x \in S

, and -1 otherwise; and \Select(i) which returns the

i

-th smallest element in

S

. We give a data structure that supports both operations in O(1) time on the RAM model and requires

{\cal B}(n,m) + o(n) + O(\lg \lg m)

bits to store a set of size

n

, where {\cal B}(n,m) = \ceil{\lg {m \choose n}} is the minimum number of bits required to store any

n

-element subset from a universe of size

m

. Previous dictionaries taking this space only supported (yes/no) membership queries in O(1) time. In the cell probe model we can remove the

O(\lg \lg m)

additive term in the space bound, answering a question raised by Fich and Miltersen, and Pagh. We present extensions and applications of our indexable dictionary data structure, including: An information-theoretically optimal representation of a

k

-ary cardinal tree that supports standard operations in constant time, A representation of a multiset of size

n

from

\{0,...,m-1\}

{\cal B}(n,m+n) + o(n)

bits that supports (appropriate generalizations of) \Rank and \Select operations in constant time, and A representation of a sequence of

n

non-negative integers summing up to

m

{\cal B}(n,m+n) + o(n)

bits that supports prefix sum queries in constant time.Comment: Final version of SODA 2002 paper; supersedes Leicester Tech report 2002/1

arXiv.org e-Print Archive

Crossref

Fast Breadth-First Search in Still Less Space

Author: D Angluin
G Barnes
N Banerjee
N Banerjee
O Reingold
T Hagerup
T Hagerup
WJ Savitch
Publication venue
Publication date: 09/01/2019
Field of study

It is shown that a breadth-first search in a directed or undirected graph with

n

vertices and

m

edges can be carried out in

O(n+m)

time with

n\log_2 3+O((\log n)^2)

bits of working memory

arXiv.org e-Print Archive

OPUS Augsburg

Crossref

Matching Subsequences in Trees

Author: D. Harel
H. Yang
P. Kilpeläinen
P. Zezula
R.A. Baeza-Yates
T. Hagerup
T. Schlieder
W. Chen
Publication venue
Publication date: 01/01/2006
Field of study

Given two rooted, labeled trees

P

and

T

the tree path subsequence problem is to determine which paths in

P

are subsequences of which paths in

T

. Here a path begins at the root and ends at a leaf. In this paper we propose this problem as a useful query primitive for XML data, and provide new algorithms improving the previously best known time and space bounds.Comment: Minor correction of typos, et

arXiv.org e-Print Archive

CiteSeerX

Crossref

Online Research Database In Technology

A Lower-Bound for the Emulation of PRAM Memories on Processor Networks

Author: Hagerup T.
Publication venue: Academic Press.
Publication date: 01/01/1995
Field of study

AbstractWe show a lower bound of Ω(min{log m, √n}) on the slowdown of any deterministic emulation of a PRAM memory with m cells and n I/O ports on an n-processor bounded-degree network. The bound is weak; unlike all previous bounds, however, it does not depend on the unnatural assumption of point-to-point communication which says, roughly, that messages in transit cannot be duplicated by intermediate processors. For m sufficiently large relative to n, the new bound implies the optimality of a simple emulation on a mesh-of-trees network

OPUS Augsburg

Elsevier - Publisher Connector

MPG.PuRe

Succinct Partial Sums and Fenwick Trees

Author: AC Yao
M Patrascu
ML Fredman
PF Dietz
PM Fenwick
R Raman
T Hagerup
WK Hon
Publication venue
Publication date: 01/01/2017
Field of study

We consider the well-studied partial sums problem in succint space where one is to maintain an array of n k-bit integers subject to updates such that partial sums queries can be efficiently answered. We present two succint versions of the Fenwick Tree - which is known for its simplicity and practicality. Our results hold in the encoding model where one is allowed to reuse the space from the input data. Our main result is the first that only requires nk + o(n) bits of space while still supporting sum/update in O(log_b n) / O(b log_b n) time where 2 <= b <= log^O(1) n. The second result shows how optimal time for sum/update can be achieved while only slightly increasing the space usage to nk + o(nk) bits. Beyond Fenwick Trees, the results are primarily based on bit-packing and sampling - making them very practical - and they also allow for simple optimal parallelization

arXiv.org e-Print Archive

Crossref

Archivio istituzionale della ricerca - Università degli Studi di Venezia Ca' Foscari

Archivio della ricerca- LUISS Libera Università Internazionale degli Studi Sociali Guido Carli di Roma

Online Research Database In Technology